Day 17 - Regular expressions - Groups
I decompose the rest of the line using the year ([0-9]{4}:) as an anchor. Placing the closing
square bracket ] outside the group removes it from the output. The HTTP protocol can be
either HTTP/1.0 or HTTP/1.1, and the HTTP status is always a three-digit number ([0-9]{3}).
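The following is a minimal sketch of this kind of group-based decomposition, not necessarily the exact command used in the solution. It uses sed -E on a single line; the sample line, the variable names line and result, and the order of the output fields are assumptions made for illustration.

```shell
# Hypothetical input line (an assumption); the real log format may differ.
line='83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /index.html HTTP/1.1" 200 203023'

# Group 1: date and time, anchored on the four-digit year followed by a colon.
# Group 2: request path. Group 3: HTTP version. Group 4: three-digit status.
# The closing bracket ] sits outside every group, so it is dropped from the output.
result=$(printf '%s\n' "$line" | sed -E \
  's#.*\[([^:]+/[0-9]{4}:[0-9:]+) [^]]*\] "GET ([^ ]+) (HTTP/1\.[01])" ([0-9]{3}).*#\1 \2 \3 \4#')

printf '%s\n' "$result"
```

Only the text captured by the four groups is written back, so the brackets, the quotes, and the time zone offset do not appear in the result.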
Go back to the exercise
Exercise 17.02
The file simple.log contains lines with requests for files, such as:
83.149.9.216 [17/May/2015:10:05:03 GET /presentations/logstash-monitorama-2013/image\
s/kibana-search.png HTTP/1.1 200 203023 http://semicomplete.com/presentations/logsta\
sh-monitorama-2013/
Extract a list of all file extensions and count them. Assume that extensions are made of lowercase
letters only.
Solution
There are many ways to solve this exercise. One possible solution is to use lookaround expressions
with grep to isolate the file path and later the file extension.
$ cat simple.log | grep -Po "(?<=GET )\/.*(?= HTTP)" | grep -Po "(?<=\.)[a-z]+$" \
  | sort | uniq -c
4 access
10 c
14 conf
1 cpp
1458 css
4 deb
[...]
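To see what each grep stage contributes, the two lookaround expressions can be tried on the sample line from the exercise. This is a small sketch that assumes GNU grep built with -P (PCRE) support; the variable names line, path, and ext are introduced here for illustration only.

```shell
# Sample line from the exercise, joined into a single line.
line='83.149.9.216 [17/May/2015:10:05:03 GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1 200 203023 http://semicomplete.com/presentations/logstash-monitorama-2013/'

# Stage 1: keep only what lies between "GET " and " HTTP" (the file path).
path=$(printf '%s\n' "$line" | grep -Po '(?<=GET )/.*(?= HTTP)')

# Stage 2: keep the run of lowercase letters that follows the last dot
# at the end of the path (the file extension).
ext=$(printf '%s\n' "$path" | grep -Po '(?<=\.)[a-z]+$')

printf '%s\n%s\n' "$path" "$ext"
```

The lookbehind and lookahead only assert their context without consuming it, so "GET ", " HTTP", and the dot are not part of the matched text. The lowercase http:// in the referrer does not interfere, because the lookahead requires an uppercase " HTTP".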
Go back to the exercise
Exercise 17.03
There are three lines in the file simple.log where a request received an HTTP 500 status code, for
example